Download paginated API data to a CSV

Python

Author: Sandy Rogers

Affiliation: MGnify team at EMBL-EBI

This is a static preview. You can run and edit these examples interactively on Galaxy.

Fetch paginated data from the MGnify API, and save it as a CSV file

The MGnify API returns paginated data. When you list data, it comes to you in pages, or chunks. You have to request each page in turn. The jsonapi_client package can do this for you, automatically.
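To make "request each page in turn" concrete, here is a minimal sketch of what such a loop might look like without jsonapi_client. It assumes each response follows the JSON:API convention of a `data` list plus a `links.next` URL that is null on the last page. The names `fetch_json` and `iterate_resources` are hypothetical helpers for illustration, not part of any library.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download one page and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iterate_resources(first_url, fetch=fetch_json):
    """Yield every item of a paginated listing, following 'next' links.

    Assumption: each page looks like {"data": [...], "links": {"next": ...}}.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        # Move on to the next page; a missing or null link ends the loop.
        url = (page.get("links") or {}).get("next")


# Possible usage (makes real network requests):
# for item in iterate_resources(
#         "https://www.ebi.ac.uk/metagenomics/api/v1/super-studies"):
#     print(item["id"])
```

Passing the page-fetching function as an argument keeps the loop easy to try out with canned pages before pointing it at the live API.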

This example shows you how to download a paginated list of data and save it to a CSV file.

You can find all of the other “API endpoints” using the Browsable API interface in your web browser. The URL you see in the browsable API is exactly the same as the one you can use in this code.

This is an interactive code notebook (a Jupyter Notebook). To run this code, click into each cell and press the ▶ button in the top toolbar, or press Shift+Enter.

We pick an API endpoint for the kind of data to download:

from lib.variable_utils import get_variable_from_link_or_input

# You can also just directly set the api_endpoint variable in code, like this:
# api_endpoint = 'super-studies'

api_endpoint = get_variable_from_link_or_input('API_ENDPOINT', 'API Endpoint', 'super-studies')

Using API Endpoint super-studies from the link you followed.

Using "super-studies" as API Endpoint

Use jsonapi_client to go through the paginated data. Note that this may take quite a long time for long lists, because the API automatically slows down your connection if you request a lot of data. This keeps the service working well for everybody else.
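Because long lists can be slow, it may help to preview only the first few items before committing to a full download. The sketch below uses `itertools.islice` on a stand-in generator. The helper name `preview` and the fake `slow_listing` are hypothetical, and the approach assumes the listing is produced lazily, one item at a time.

```python
from itertools import islice


def preview(resources, limit=5):
    """Return at most `limit` items, taking no more from the iterator than that."""
    return list(islice(resources, limit))


# Stand-in for a slow paginated listing: it records every item that is
# actually pulled from it, so we can see how much work was done.
requested = []


def slow_listing():
    for number in range(1000):
        requested.append(number)
        yield {"id": str(number)}


first_items = preview(slow_listing(), limit=3)
print(first_items)
print(f"Items pulled from the listing: {len(requested)}")
```

With a real session, something like `preview(mgnify.iterate(api_endpoint))` would be the analogous call, assuming `iterate` yields items lazily. The API still delivers whole pages, so a few more items than the limit may be downloaded behind the scenes.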

We use pandas, an excellent library for data analysis, to normalise the data into a table.

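from jsonapi_client import Session
import pandas as pd

with Session("https://www.ebi.ac.uk/metagenomics/api/v1") as mgnify:
    # iterate() requests each page in turn; .json is the raw JSON of each item
    resources = map(lambda r: r.json, mgnify.iterate(api_endpoint))
    # Flatten the nested JSON into a table with one row per item
    resources = pd.json_normalize(resources)
    # Save the table to a file named after the endpoint, e.g. super-studies.csv
    resources.to_csv(f"{api_endpoint}.csv")
# Display the table in the notebook
resources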

	type	id	attributes.super-study-id	attributes.title	attributes.url-slug	attributes.description	attributes.image-url	attributes.biomes-count
0	super-studies	1	1	Tara Oceans	tara-oceans	The Tara Oceans expedition (Karsenti et al. 20...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
1	super-studies	2	2	Earth Microbiome Project	earth-microbiome-project	The Earth Microbiome Project is now available ...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
2	super-studies	3	3	NASA GeneLab Microbiome (MANGO)	nasa-genelab-microbiome-mango	Project MANGO provides access to the microbiom...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
3	super-studies	4	4	HoloFood	holofood	Holistic approach to improve the efficiency of...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	2
4	super-studies	5	5	Malaspina	malaspina	The Malaspina circumnavigation expedition was ...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
5	super-studies	6	6	AtlantECO	atlanteco	The EU-funded AtlantECO project aims to develo...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
6	super-studies	7	7	FindingPheno	findingpheno	FindingPheno is creating an integrated computa...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	7
7	super-studies	8	8	National Mouse Genetics Network (NMGN) Microbi...	nmgn-microbiome	The Microbiome Cluster of the National Mouse G...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	0
8	super-studies	9	9	MICROBE	MICROBE	MICROBE paves the way for an innovative microb...	data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...	2
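The dotted column names in the table above, such as attributes.title, come from flattening the nested JSON of each item. The following plain-Python sketch imitates that idea for a single record. It is an illustration only, not how pandas implements `json_normalize`, and the helper name `flatten` is hypothetical. The sample record reuses values shown in the first table row.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing their keys.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


record = {
    "type": "super-studies",
    "id": "1",
    "attributes": {"title": "Tara Oceans", "url-slug": "tara-oceans"},
}
flat = flatten(record)
for column, value in flat.items():
    print(f"{column}: {value}")
```

Each flattened record becomes one row of the table, and each dotted key becomes one column of the CSV.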